FUNCTIONAL DATA ANALYSIS WITH INCREASING NUMBER OF PROJECTIONS



STEFAN FREMDT, LAJOS HORVATH, PIOTR KOKOSZKA, AND JOSEF G. STEINEBACH 

Abstract. Functional principal components (FPC's) provide the most important and 
most extensively used tool for dimension reduction and inference for functional data. 
The selection of the number, d, of the FPC's to be used in a specific procedure has 
attracted a fair amount of attention, and a number of reasonably effective approaches 
exist. Intuitively, they assume that the functional data can be sufficiently well approxi- 
mated by a projection onto a finite-dimensional subspace, and the error resulting from 
such an approximation does not impact the conclusions. This has been shown to be a 
very effective approach, but it is desirable to understand the behavior of many inferen- 
tial procedures by considering the projections on subspaces spanned by an increasing 
number of the FPC's. Such an approach reflects more fully the infinite-dimensional 
nature of functional data, and allows us to derive procedures which are fairly insensitive
to the selection of d. This is accomplished by considering limits as $d \to \infty$ with the
sample size.

We propose a specific framework in which we let $d \to \infty$ by deriving a normal
approximation for the partial sum process

$\sum_{j=1}^{\lfloor du \rfloor} \sum_{i=1}^{\lfloor Nx \rfloor} \xi_{i,j}, \quad 0 \le u \le 1, \ 0 \le x \le 1,$

where N is the sample size and $\xi_{i,j}$ is the score of the ith function with respect to
the jth FPC. Our approximation can be used to derive statistics that use segments of 
observations and segments of the FPC's. We apply our general results to derive two 
inferential procedures for the mean function: a change-point test and a two-sample test. 
In addition to the asymptotic theory, the tests are assessed through a small simulation 
study and a data example. 
1. Introduction

Functional data analysis has grown into a comprehensive and useful field of statistics 
which provides a convenient framework to handle some high-dimensional data structures,
including curves and images. The monograph of Ramsay and Silverman (2005)
has done a lot to introduce its ideas to the statistics community and beyond. Several
other monographs and thousands of papers followed. This paper focuses on a specific 
aspect of the mathematical foundations of functional data analysis, which is however 
of fairly central importance. We first describe the contribution of this paper in broad 
terms, and provide some more detailed background and discussion in the latter part of 
this section. 

Perhaps the most important, and definitely the most commonly used, tool for dimension 
reduction of functional data is the principal component analysis. Suppose we observe a 
sample of functions, $X_1, X_2, \ldots, X_N$, and denote by

$\hat\eta_{ij} = \int (X_i(t) - \bar X_N(t)) \hat v_j(t)\,dt, \quad i = 1, 2, \ldots, N, \ j = 1, 2, \ldots, d,$

the scores of the $X_i$ with respect to the estimated functional principal components $\hat v_j$.
The scores $\hat\eta_{ij}$ depend on two variables i and j, and to reflect the infinite-dimensional
nature of the data, it may be desirable to consider asymptotics in which both N and
d increase. This paper establishes results that allow us to study the two-dimensional
partial sum process

$\sum_{j=1}^{\lfloor du \rfloor} \sum_{i=1}^{\lfloor Nx \rfloor} \int (X_i(t) - \mu_X(t)) v_j(t)\,dt, \quad 0 \le u \le 1, \ 0 \le x \le 1.$

More specifically, we derive a uniform normal approximation and apply it to two prob- 
lems related to testing the null hypothesis that all observed curves have the same mean 
function. We obtain new test statistics in which the number of the functional princi- 
pal components, d, increases slowly with the sample size N. We hope that our general 
approach will be used to derive similar results in other settings. 

Statistical procedures for functional data which use functional principal components 
(FPC's) often depend on the number d of the components used to compute various sta- 
tistics. The selection of an optimal d has received a fair deal of attention. Commonly 
used approaches include the cumulative variance method, the scree plot, and several 
forms of cross-validation and pseudo information criteria. By now, most of these ap- 
proaches are implemented in several R packages and in the Matlab package PACE. A 
related direction of research has focused on the identification of the dimension d assum- 
is concerned with functional data which cannot be reduced to finite-dimensional data in
an obvious and easy way. Such data are typically characterized by a slow decay of the
eigenvalues of the empirical covariance operator. Figure 1 shows the eigenvalues of the
empirical covariance operator of the annual temperature curves obtained over the period
1856-2011 in Melbourne, Australia, while Figure 2 shows the cumulative variance plot
for the same data set. It is seen that the eigenvalues decay at a slow rate, and neither
their visual inspection nor the analysis of cumulative variance provides clear guidance
on how to select d. This data set is analyzed in greater detail in Section 5.

Figure 1. Melbourne temperature data: eigenvalues $\hat\lambda_2, \ldots, \hat\lambda_{49}$.

Figure 2. Melbourne temperature data: percentage of variance explained
by the first k eigenvalues, i.e. $f_k = \sum_{i=1}^{k} \hat\lambda_i / \sum_{j=1}^{49} \hat\lambda_j$, $k = 1, 2, \ldots, 49$.

In situations when the choice of d is difficult, two approaches seem reasonable. In the
first approach, one can apply a test using several values of d in a reasonable range. If the
conclusion does not depend on d, we can be confident that it is correct. This approach
has been used in applied research, see Gromenko et al. (2012) for a recent analysis of this
type. The second approach would be to let d increase with the sample size N, and derive
a test statistic based on the limit. In a sense, the second approach is a formalization of the
first one because if a limit as $d \to \infty$ exists, then the conclusions should not depend on
the choice of d, if it is reasonably large. In the FDA community there is a well grounded
intuition that d should increase much slower than N, so asymptotically large d need not
be very large in practice. It is also known that the rate at which d increases should
depend on the manner in which the eigenvalues decay. We obtain specific conditions
that formalize this intuition in the framework we consider. In more specific settings,
contributions in this direction were made by Cardot et al. (2003) and Panaretos et al.
(2010). The work of Cardot et al. (2003) is more closely related to our research: as part
of the justification of their testing procedure, they establish conditions under which a
limiting chi-square distribution with d degrees of freedom can be approximated by a
normal distribution as $d = d(N) \to \infty$. Panaretos et al. (2010) are concerned with a
test of the equality of the covariance operators in two samples of Gaussian curves. In the
supplemental material, they derive asymptotics in which d is allowed to increase with
the sample size. Our theory is geared toward testing the equality of mean functions,
but we do not assume the normality of the functional observations, so we cannot use
arguments that use the equivalence of independence and zero covariances. We develop
a new technique based on the estimation of the Prokhorov-Lévy distance between the
underlying processes and the corresponding normal partial sums.

The paper is organized as follows. In Section 2 we set the framework and state a general
normal approximation result in Theorem 2.1. This result is then used in Sections 3 and
4 to derive, respectively, change-point and two-sample tests based on an increasing
number of FPC's. Section 5 contains a small simulation study and an application to the
annual Melbourne temperature curves. All proofs are collected in the appendices.



2. Uniform normal approximation 

We consider functional observations $X_i(t)$, $t \in \mathcal{T}$, $i = 1, 2, \ldots, N$, defined over a
compact interval $\mathcal{T}$. We can and shall assume without loss of generality that $\mathcal{T} = [0, 1]$.
Throughout the paper, we use the notation $\int = \int_0^1$ and

$\langle f, g \rangle = \int f(t) g(t)\,dt, \quad \|f\|^2 = \langle f, f \rangle.$

All functions we consider will be elements of the Hilbert space of square integrable
functions on [0, 1].
In the testing problems that motivate this research, under the null hypothesis, the
observations follow the model

(2.1) $X_i(t) = \mu(t) + Z_i(t), \quad 1 \le i \le N,$

where $EZ_i(t) = 0$ and $\mu(t)$ is the common mean. We impose the following standard
assumptions.

Assumption 2.1. $Z_1, Z_2, \ldots, Z_N$ are independent and identically distributed.
Assumption 2.2. $\int \mu^2(t)\,dt < \infty$ and $E\|Z_1\|^2 < \infty$.

Under these assumptions, the covariance function

$c(t, s) = EZ_1(t) Z_1(s)$

is square integrable on the unit square and therefore it has the representation

$c(t, s) = \sum_{k=1}^{\infty} \lambda_k v_k(t) v_k(s),$

where $\lambda_1 \ge \lambda_2 \ge \ldots$ are the eigenvalues and $v_1, v_2, \ldots$ are the orthonormal eigenfunctions
of the covariance operator, i.e. they satisfy the integral equation

(2.2) $\lambda_i v_i(t) = \int c(t, s) v_i(s)\,ds.$

One of the most important dimension reduction techniques of functional data analysis
is to project the observations $X_1(t), \ldots, X_N(t)$ onto the space spanned by $v_1, \ldots, v_d$, the
eigenfunctions associated with the d largest eigenvalues. Since the covariance function
c, and therefore $v_1, \ldots, v_d$, are unknown, we use the empirical eigenfunctions $\hat v_1, \ldots, \hat v_d$
and eigenvalues $\hat\lambda_1 \ge \hat\lambda_2 \ge \cdots \ge \hat\lambda_d$ defined by

(2.3) $\hat\lambda_j \hat v_j(t) = \int \hat c_N(t, s) \hat v_j(s)\,ds,$

where

$\hat c_N(t, s) = \frac{1}{N} \sum_{i=1}^{N} (X_i(t) - \bar X_N(t)) (X_i(s) - \bar X_N(s))$

with $\bar X_N(t) = N^{-1} \sum_{i=1}^{N} X_i(t)$.
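
As an illustration of (2.3), the following is a minimal sketch of how the empirical eigenvalues, eigenfunctions and scores could be computed when each curve is observed on an equispaced grid in [0, 1]. The function name `empirical_fpca`, the Riemann-sum quadrature and the grid layout are assumptions made for this example only; they are not part of the paper, and a basis-expansion approach (as in the R package fda) would be an equally valid choice.

```python
import numpy as np

def empirical_fpca(X, d):
    """Hypothetical helper: discretized version of the eigenproblem (2.3).

    X is an (N, m) array holding N curves sampled at m equispaced points
    of [0, 1]; d is the number of principal components to keep.
    """
    N, m = X.shape
    h = 1.0 / m                                  # Riemann-sum quadrature weight
    Xc = X - X.mean(axis=0)                      # subtract the sample mean curve
    C = Xc.T @ Xc / N                            # empirical covariance on the grid
    evals, evecs = np.linalg.eigh(C * h)         # discretized integral operator
    order = np.argsort(evals)[::-1][:d]          # indices of the d largest eigenvalues
    lam = evals[order]
    v = evecs[:, order].T / np.sqrt(h)           # rescale so that int v_j(t)^2 dt = 1
    scores = Xc @ v.T * h                        # int (X_i - mean)(t) v_j(t) dt
    return lam, v, scores
```

With this normalization the sample second moment of the jth score column equals the jth empirical eigenvalue, which gives a quick sanity check of an implementation.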

In this section, we require only two more assumptions, namely 



Assumption 2.3. $\lambda_1 > \lambda_2 > \cdots$
Assumption 2.4. $E\|Z_1\|^3 < \infty$.

Assumption 2.3 is needed to ensure that the FPC's $v_j$ are uniquely defined. In Theorem 2.1
it could, of course, be replaced by requiring only that the first d eigenvalues are
positive and different, but since in the applications we let $d \to \infty$, we just assume that
all eigenvalues are positive and distinct. If $\lambda_{d^*+1} = 0$ for some $d^*$, then the observations
are in the linear span of $v_1, \ldots, v_{d^*}$, i.e. they are elements of a $d^*$-dimensional space,
so in this case we cannot consider $d = d(N) \to \infty$. Assumption 2.3 means that the
observations are in an infinite-dimensional space. Assumption 2.4 is weaker than the
usual assumption $E\|Z_1\|^4 < \infty$. As will be seen in the proofs, subtle arguments of the
probability theory in Banach spaces are needed to dispense with the fourth moment.
To state the main result of this section, define 

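$\boldsymbol{\xi}_i = (\xi_{i,1}, \ldots, \xi_{i,d})^T$ and $\xi_{i,j} = \lambda_j^{-1/2} \langle Z_i, v_j \rangle, \quad 1 \le i \le N, \ 1 \le j \le d,$

where $^T$ denotes the transpose of vectors and matrices. Set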

(2.4) ^(3;) = __ ^ ^. ^, 0<x<l, l<j<d. 

1=1 

We now provide an approximation for the partial sum processes $S_{j,N}(x)$ defined in (2.4)
with suitably constructed Wiener processes (standard Brownian motions).

Theorem 2.1. If Assumptions 2.1, 2.3 and 2.4 hold, then for every N we can define
independent Wiener processes $W_{1,N}, \ldots, W_{d,N}$ such that

(2.5) p| max sup \Sjn{x) - Wj^N{x)\ > Ari/2-i/8o| 

|^l<i<rf 0<a;<l ' J 

< cAT-Vso Li/12 i/x,) 1/Af I , 

L ^^=1 ^ j=i J 

where c depends only on $\lambda_1$ and $E\|Z_1\|^3$.

The constant 1/80 in (2.5) is not crucial; it is a result of our calculations. Theorem 2.1
is related to the results of Einmahl (1987, 1989) who obtained strong approximations for
partial sums of independent and identically distributed random vectors with zero mean
and with identity covariance matrix. In our setting, for any fixed d, the covariance matrix
is not the identity, but this is not the central difficulty. The main value of Theorem 2.1
stems from the fact that it shows how the rate of the approximation depends on d; no
such information is contained in the work of Einmahl (1987, 1989), who did not need to
consider the dependence on d. The explicit dependence of the right hand side of (2.5) on
d is crucial in the applications presented in the following sections in which the dimension
of the projection space depends on the sample size N.

Very broadly speaking, Theorem 2.1 implies that in all reasonable statistics based on
averaging the scores, even in those based on an increasing number of FPC's, the partial
sums of scores can be replaced by Wiener processes to obtain a limit distribution. The
right hand side of (2.5) allows us to derive assumptions on the eigenvalues required to
obtain a specific result. Replacing the unobservable scores $\xi_{i,j}$ by the sample scores $\hat\eta_{i,j}$
is relatively easy. We will illustrate these ideas in Sections 3 and 4.

3. Change-point detection

Over the past four decades, the investigation of the asymptotic properties of partial sum 
processes has to a large extent been motivated by change-point detection procedures, and 
this is the most natural application of Theorem 2.1. The research on the change-point
problem in various contexts is very extensive; some aspects of the asymptotic theory are
presented in Csörgő and Horváth (1997). Detection of a change in the mean function
was studied by Berkes et al. (2009) who considered a procedure in which the number of
the FPC's, d, was fixed, and the asymptotic distribution of the test statistic depended
on d. We show in this section that it is possible to derive tests with a standard normal
limiting distribution by allowing d to depend on the sample size N.
We want to test whether the mean of the observations remained the same during the
observation period, i.e. we test the null hypothesis

$H_0: EX_1(\cdot) = EX_2(\cdot) = \cdots = EX_N(\cdot)$

("=" means equality in $L^2$). Under the null hypothesis, the $X_i$ follow model (2.1) in
which $\mu(\cdot)$ is an unknown common mean function under $H_0$. The alternative hypothesis
is 



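$H_A$: there is $k^* \in \{1, 2, \ldots, N\}$ such that
$EX_1(\cdot) = \cdots = EX_{k^*}(\cdot) \neq EX_{k^*+1}(\cdot) = \cdots = EX_N(\cdot)$.

Under $H_A$ the mean changes at an unknown time $k^*$.
To derive a new class of tests, we introduce the process

$Z_N(u, x) = \frac{1}{d^{1/2}} \sum_{j=1}^{\lfloor du \rfloor} \left\{ \frac{1}{N} \left( S_j(\lfloor Nx \rfloor) - x S_j(N) \right)^2 - x(1 - x) \right\}, \quad 0 \le u, x \le 1,$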



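where

$S_j(k) = \frac{1}{\hat\lambda_j^{1/2}} \sum_{i=1}^{k} \langle X_i, \hat v_j \rangle.$

The process $Z_N(u, x)$ contains the cumulative sums $S_j(\lfloor Nx \rfloor) - x S_j(N)$ which measure
the deviation of the partial sums from their "trend" under $H_0$, and a correction term
$x(1 - x)$ needed to ensure convergence as $d \to \infty$.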
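
To make the construction concrete, here is a minimal sketch that evaluates a two-parameter CUSUM process of this form on the grid $u = k/d$, $x = m/N$. It assumes that a matrix of projections of the curves on the estimated FPC's and the matching eigenvalue estimates are already available; the function name `cusum_process` and the array layout are hypothetical choices for this illustration and are not taken from the paper.

```python
import numpy as np

def cusum_process(scores, lam):
    """Hypothetical helper: Z[k-1, m] approximates Z_N(k/d, m/N).

    scores is an (N, d) array of projections on the first d FPC's,
    lam holds the corresponding d eigenvalue estimates.
    """
    N, d = scores.shape
    S = np.cumsum(scores / np.sqrt(lam), axis=0)     # normalized partial sums S_j(m)
    S = np.vstack([np.zeros(d), S])                  # prepend S_j(0) = 0
    x = np.arange(N + 1) / N
    bridge = S - np.outer(x, S[-1])                  # S_j(m) - (m/N) S_j(N)
    terms = bridge ** 2 / N - (x * (1 - x))[:, None] # centered squared CUSUMs
    return np.cumsum(terms, axis=1).T / np.sqrt(d)   # sum over j <= k, scaled by d^{-1/2}
```

A Cramér-von Mises type functional of the process can then be approximated by `np.mean(Z ** 2)`, where `Z` is the returned array.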

To obtain a limit which does not depend on any unknown quantities, we need to impose 
assumptions on the rate at which d = d{N) increases with N. Intuitively, the assump- 
tions below state that d is much smaller than the sample size N, the d largest eigenvalues 
are not too small, and that the difference between the consecutive eigenvalues tends to 
zero slowly. Very broadly speaking, these assumptions mean that the distribution of the 
observations must sufficiently fill the whole infinite-dimensional space $L^2$.

Assumption 3.1. $d = d(N) \to \infty$.

$(d \log N)^{1/2} N^{-1/80} \to 0,$

^1/12^-1/80 • 1 M . , 



Assumption 3.2. 
Assumption 3.3. 



Assumption 3.4. 



Eva.) 

d 

AT-l/SO^ 1/^3/2 
Assumption 3.5.



where $\zeta_1 = \lambda_1 - \lambda_2$, $\zeta_j = \min(\lambda_{j-1} - \lambda_j, \lambda_j - \lambda_{j+1})$, $j \ge 2$.

With these preparations, we can state the main result of this section. 

Theorem 3.1. If Assu'mptions l2JU2.3\ and \3.1\W^ are satisfied, then 

$Z_N(u, x) \to \Gamma(u, x)$ in $\mathcal{D}[0, 1]^2$,
where $\Gamma(u, x)$ is a mean zero Gaussian process with

$E[\Gamma(u, x) \Gamma(v, y)] = 2 u x^2 (1 - y)^2, \quad 0 \le u \le v \le 1, \ 0 \le x \le y \le 1.$

One can verify by computing the covariance functions that

(3.1) $\{\Gamma(u, x), \ 0 \le u, x \le 1\} \stackrel{d}{=} \{\sqrt{2} (1 - x)^2 W(u, x^2/(1 - x)^2), \ 0 \le u, x \le 1\},$

where $\{W(v, y), v, y \ge 0\}$ is a bivariate Wiener process, i.e. $W(v, y)$ is a Gaussian process
with $EW(v, y) = 0$ and $E[W(v, y) W(v', y')] = \min(v, v') \min(y, y')$. Representation
(3.1) means that continuous functionals of the process $\Gamma(\cdot, \cdot)$ can be simulated with
arbitrary precision, so Monte Carlo tests can be used. One would choose the number of
projections in the CUSUM procedure such that the test would give the largest rejection
if the alternative holds. The statistic $\max_u \max_x |Z_N(u, x)|$ maximizes the CUSUM
statistics $\max_x |Z_N(k/d, x)|$, where $k = 1, 2, \ldots, d$ projections are used. It is however
possible to obtain a number of simple asymptotic tests by examining closer the structure
of the process $\Gamma(\cdot, \cdot)$. We list some of them in Corollary 3.1, and we will see in Section 5
that the Cramér-von Mises type tests have very good finite sample properties. Let B
denote a Brownian bridge and define

$\mu_0 = E\left( \sup_{0 \le x \le 1} B^2(x) \right)$ and $\sigma_0^2 = \mathrm{var}\left( \sup_{0 \le x \le 1} B^2(x) \right).$

Corollary 3.1. If the assumptions of Theorem 3.1 are satisfied, then

(3.2) ^ Ar(o,i), 

(3-3) y^.\i]^j(SANx\)-xS^^^^^ ^ iV(0,l), 

(3.4) (rf7^|„^<^P,E^(^^(LA^^J)-^^^^ 4 iV(0,l), 

where $N(0, 1)$ stands for a standard normal random variable.

We conclude this section with two examples which show that Assumptions 3.2–3.5 hold
under both power law and exponential decay of the eigenvalues.

Example 3.1. If the eigenvalues satisfy



A, 



Cl 



+ 



a+1 



as J — )■ oo, 



with some $c_1 > 0$, $0 < c_2 < 1$ and $\alpha > 0$, then Assumptions 3.2–3.5 hold if $d/(\log N)^{\beta} \to 0$
with some $\beta > 0$.

Under the conditions of Example 3.1, one could choose $d_N = O(N^{\zeta})$, where $\zeta$ depends
on $\alpha$. For a fixed sample size N, the power of the test would decrease if $d_N$ is
too large. Hence we recommend choosing $d_N \sim (\log N)^{\beta}$, where $\beta > 0$ can be arbitrarily
chosen.

Example 3.2. If the eigenvalues satisfy

$\lambda_j = c_0 e^{-\alpha j} + o(e^{-\alpha j}),$ as $j \to \infty$,

with some $c_0 > 0$ and $\alpha > 0$, then Assumptions 3.2–3.5 hold if $d/(\log\log N)^{\beta} \to 0$ with
some $\beta > 0$.



4. Two-sample problem

The two-sample problem for functional data was perhaps first discussed in depth by
Benko et al. (2009) who were motivated by a problem related to implied volatility curves.
It has recently attracted a fair amount of attention motivated by problems arising in space
physics, see Horváth et al. (2009), genetics, see Panaretos et al. (2010), and finance, see
Horváth et al. (2012). The above list does not include many other important contributions.
In its simplest, but most important form, it is about testing if curves obtained
from two populations have the same mean functions. The most direct approach, developed
into a bootstrap procedure by Benko et al. (2009), is to look at the norm of the
difference of the estimated mean functions. In this section, we show that the normal
approximation of Section 2 leads to an asymptotic test whose limit distribution is standard
normal.

Suppose we have two random samples of functions: $X_1, \ldots, X_N$ and $Y_1, \ldots, Y_M$. We
assume the X sample satisfies (2.1) and Assumptions 2.1, 2.2 and 2.4. Similarly, the Y
sample is a location model given by



(4.1) $Y_i(t) = \mu_*(t) + Q_i(t), \quad 1 \le i \le M,$

where $\mu_*(t)$ is the common mean of the Y sample and $EQ_i(t) = 0$. As in the case of the
X sample, the Y sample satisfies the following conditions:

Assumption 4.1. $Q_1, Q_2, \ldots, Q_M$ are independent and identically distributed.

Assumption 4.2. $\int \mu_*^2(t)\,dt < \infty$ and $E\|Q_1\|^3 < \infty$.

Assumption 4.2 yields that

$c_*(t, s) = EQ_1(t) Q_1(s)$

is a square integrable function on the unit square.

In this section we are interested in testing the null hypothesis

$H_0^*: \mu(\cdot) = \mu_*(\cdot).$

The statistical inference to test $H_0^*$ is based on the difference $\bar X_N - \bar Y_M$, where $\bar X_N$ and
$\bar Y_M$ denote the sample means. We assume

Assumption 4.3. 

^ = A + 0(iV-i/^) as min(M, A^) ^ oo 

with some < A < oo. 

Now we define the pooled covariance function 

cp(t, s) = c(t, s) + Ac*(t, s). 

Since $c_p(t, s)$ is a positive-definite, symmetric, square integrable function, there are real
numbers $\kappa_1 \ge \kappa_2 \ge \ldots$ and orthonormal functions $u_1, u_2, \ldots$ satisfying

$\kappa_i u_i(t) = \int c_p(t, s) u_i(s)\,ds, \quad i = 1, 2, \ldots$

We wish to project $\bar X_N - \bar Y_M$ into the space spanned by $u_1, \ldots, u_d$, where $d = d(N) \to \infty$,
so similarly to Assumption 2.3 we require

Assumption 4.4. $\kappa_1 > \kappa_2 > \cdots$
Assumption 4.5. 

/ d \3/8 
^-3/32^1/4 / y\ _^ Q 

Our test statistic is

$D_{N,M} = \sum_{i=1}^{d} N \langle \bar X_N - \bar Y_M, u_i \rangle^2 / \kappa_i.$

As in Section 3, we need additional assumptions balancing the rate of growth of $d = d(N)$
and the rate of decay of the $\kappa_i$ and the differences between them.

Assumption 4.6. 

where Li = ^2 — ki, li = min(i£_i — t^, li — i^+i), i > 2. 

Since $u_1, u_2, \ldots$ are unknown, we replace them with the corresponding empirical
eigenfunctions $\hat u_1, \hat u_2, \ldots$ defined by the integral operator

$\hat\kappa_i \hat u_i(t) = \int \hat c_p(t, s) \hat u_i(s)\,ds, \quad i = 1, 2, \ldots,$

where $\hat\kappa_1 \ge \hat\kappa_2 \ge \cdots$ and

$\hat c_p(t, s) = \hat c_N(t, s) + \frac{N}{M} \hat c_{*,M}(t, s),$

with

$\hat c_{*,M}(t, s) = \frac{1}{M} \sum_{\ell=1}^{M} (Y_\ell(t) - \bar Y_M(t)) (Y_\ell(s) - \bar Y_M(s)).$

  α                 0.01        0.05         0.10
  critical value    0.109256    0.0726292    0.0578267

Table 5.1. Critical values for the distribution of (5.3).

The empirical version of $D_{N,M}$ is

$\hat D_{N,M} = \sum_{i=1}^{d} N \langle \bar X_N - \bar Y_M, \hat u_i \rangle^2 / \hat\kappa_i.$


Theorem 4.1. IfH^, Assumptions\K^\23 and lJJl^JT^ hold, th 



len 



$(2d)^{-1/2} (\hat D_{N,M} - d) \stackrel{d}{\to} N(0, 1),$
where $N(0, 1)$ stands for a standard normal random variable.
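
The following sketch shows one way the standardized two-sample statistic could be evaluated for curves observed on a common equispaced grid in [0, 1]. The function name `two_sample_statistic`, the Riemann-sum discretization, and the use of an upper-tail normal p-value are assumptions made for this illustration; the paper itself only states the limit distribution.

```python
import math
import numpy as np

def two_sample_statistic(X, Y, d):
    """Hypothetical helper: returns (D_hat, standardized value, upper-tail p-value).

    X is (N, m) and Y is (M, m); both samples share the same grid of m points.
    """
    N, m = X.shape
    M = Y.shape[0]
    h = 1.0 / m                                        # quadrature weight
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    C = Xc.T @ Xc / N + (N / M) * (Yc.T @ Yc / M)      # pooled empirical covariance
    evals, evecs = np.linalg.eigh(C * h)
    order = np.argsort(evals)[::-1][:d]
    kappa = evals[order]                               # d largest pooled eigenvalues
    u = evecs[:, order].T / np.sqrt(h)                 # eigenfunctions with unit L2 norm
    diff = X.mean(axis=0) - Y.mean(axis=0)             # difference of sample means
    proj = u @ diff * h                                # inner products with u_i
    D = N * np.sum(proj ** 2 / kappa)
    z = (D - d) / math.sqrt(2 * d)                     # centering and scaling by d, 2d
    p_value = 0.5 * math.erfc(z / math.sqrt(2))        # P(N(0,1) > z)
    return D, z, p_value
```

Rejecting for large values of the standardized statistic reflects the fact that a difference in means inflates the projections; this one-sided convention is a choice of the sketch.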

5. A small simulation study and a data example 

The main contribution of this paper lies in the statistical theory, but it is of interest to 
check if the new tests derived in Sections 3 and 4 perform well in finite samples. We
report the results for the test based on Theorem 3.1 in some detail, as it utilizes the
convergence of the two-parameter process in full force, and such an approach has not
been used before. We also comment on the tests based on Corollary 3.1 and Theorem 4.1.
We conclude this section with an illustrative data example.

The simulated data which satisfy the null hypotheses of Sections 3 and 4 are generated
as independent Brownian motions on the interval [0, 1]. We generate them by using iid
normal increments on 1,000 equispaced points in [0,1] (random walk approximation).
(Example 3.1 shows that for the Brownian motion the assumptions of Theorem 3.1 are
satisfied.) Alternatives are obtained by adding the curve $at(1 - t)$ after a change-point
or to the observations in the second sample. The parameter a regulates the size of the
change or the difference in the means in two samples.
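
The data-generating step described above can be sketched as follows. The function names `brownian_sample` and `add_change` and the choice of grid $t_r = r/1000$ are assumptions of this illustration; only the random walk approximation and the added curve $at(1 - t)$ come from the text.

```python
import numpy as np

def brownian_sample(n_curves, n_points=1000, rng=None):
    """Random walk approximation of Brownian motions on [0, 1]."""
    rng = np.random.default_rng() if rng is None else rng
    # iid normal increments with variance 1/n_points, cumulated along each curve
    increments = rng.normal(scale=np.sqrt(1.0 / n_points), size=(n_curves, n_points))
    return np.cumsum(increments, axis=1)

def add_change(X, a, k_star):
    """Add the curve a*t*(1-t) to every observation after index k_star."""
    t = np.arange(1, X.shape[1] + 1) / X.shape[1]
    shifted = X.copy()
    shifted[k_star:] += a * t * (1 - t)
    return shifted
```

For the change-point setting one would call, for example, `add_change(brownian_sample(100), a=1.5, k_star=50)`; for the two-sample setting the same shift can be applied to the whole second sample with `k_star=0`.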

Many tests can be obtained from Theorem 3.1 by applying functionals continuous on
$\mathcal{D}[0, 1]^2$. It is not our objective to provide a systematic comparison; we consider only the
test based on the weak convergence

(5.1) $\int_0^1 \int_0^1 Z_N^2(u, x)\,du\,dx \stackrel{d}{\to} \int_0^1 \int_0^1 \Gamma^2(u, x)\,du\,dx.$

        5% level                       10% level
  d     size     90% CI                size     90% CI
                                       0.070    [0.0567, 0.0833]
  4     0.049    [0.0378, 0.0602]      0.075    [0.0613, 0.0887]
  5     0.053    [0.0413, 0.0647]      0.076    [0.0622, 0.0898]
  6     0.057    [0.0449, 0.0691]      0.085    [0.0705, 0.0995]
  7     0.057    [0.0449, 0.0691]      0.085    [0.0705, 0.0995]
  8     0.053    [0.0413, 0.0647]      0.085    [0.0705, 0.0995]
  9     0.051    [0.0396, 0.0624]      0.083    [0.0687, 0.0973]
  10    0.051    [0.0378, 0.0602]      0.081    [0.0668, 0.0952]
  11    0.054    [0.0496, 0.0624]      0.083    [0.0687, 0.0973]
  12    0.052    [0.0405, 0.0635]      0.086    [0.0714, 0.1006]
  13    0.050    [0.0387, 0.0613]      0.087    [0.0723, 0.1017]
  14    0.054    [0.0422, 0.0658]      0.086    [0.0714, 0.1006]
  15    0.052    [0.0405, 0.0635]      0.079    [0.0650, 0.0930]

Table 5.2. Empirical sizes and 90% confidence intervals for the probability
of rejection for the change-point test based on convergence (5.1).







Figure 3. Left panel: 20 realizations of the Brownian motion; Right
panel: independent 20 realizations of the Brownian motion with the curve
$at(1 - t)$, $a = 1.5$, added.



To compute the critical values, we use the following representation of the limit

(5.2) $\int_0^1 \int_0^1 \Gamma^2(u, x)\,du\,dx \stackrel{d}{=} \sum_{1 \le k, \ell < \infty} \lambda_k \nu_\ell N_{k,\ell}^2.$

In (5.2), the $\lambda_k = (\pi(k - 1/2))^{-2}$ are the eigenvalues of the Wiener process, the $\nu_\ell$ are
the eigenvalues of the covariance operator with kernel $2(\min(s, t) - st)^2$, and $\{N_{k,\ell}\}$ is
an array of independent standard normal random variables. The critical values were
determined for a truncated version of the right-hand side of (5.2) with truncation level
49, i.e. for

(5.3) $\sum_{1 \le k, \ell \le 49} \lambda_k \nu_\ell N_{k,\ell}^2.$

Since the eigenvalues $\nu_\ell$ are difficult to determine explicitly, they were calculated numerically
using the R package fda, cf. Ramsay et al. (2009). The simulated critical values
based on 100,000 replications of (5.3) are provided in Table 5.1.
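
The Monte Carlo step can be sketched in a few lines. This is not the authors' code: the paper computes the $\nu_\ell$ with the R package fda, whereas the sketch below swaps in a plain midpoint discretization of the kernel $2(\min(s, t) - st)^2$, and the function name `limit_quantiles`, the grid size, the seed and the reduced default number of replications are all assumptions of the example.

```python
import numpy as np

def limit_quantiles(levels=(0.10, 0.05, 0.01), trunc=49, grid=400,
                    reps=20000, seed=0):
    """Hypothetical helper: simulate the truncated double sum (5.3)."""
    k = np.arange(1, trunc + 1)
    lam = 1.0 / (np.pi * (k - 0.5)) ** 2               # Wiener process eigenvalues
    t = (np.arange(grid) + 0.5) / grid                 # midpoint grid on [0, 1]
    kernel = 2.0 * (np.minimum.outer(t, t) - np.outer(t, t)) ** 2
    nu = np.sort(np.linalg.eigvalsh(kernel / grid))[::-1][:trunc]
    weights = np.outer(lam, nu).ravel()                # lam_k * nu_l for all pairs
    rng = np.random.default_rng(seed)
    draws = np.empty(reps)
    chunk = 2000                                       # simulate in chunks to limit memory
    for start in range(0, reps, chunk):
        n = min(chunk, reps - start)
        z = rng.standard_normal((n, weights.size))
        draws[start:start + n] = (z ** 2) @ weights    # weighted sum of chi-square(1)
    quantiles = {a: float(np.quantile(draws, 1 - a)) for a in levels}
    return quantiles, draws
```

Because of the different discretization and the smaller number of replications, the simulated quantiles should only be expected to lie in the vicinity of the values in Table 5.1, not to reproduce them digit for digit.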

Table 5.2 shows the empirical sizes $\hat p$, i.e. the fraction of rejections, as well as asymptotic
90% confidence intervals

(5.4) $\left[ \hat p - 1.645 \sqrt{\hat p (1 - \hat p)/R}, \ \hat p + 1.645 \sqrt{\hat p (1 - \hat p)/R} \right]$
N = 100

            a = 1                   a = 1.5
  d      α = 0.05   α = 0.10    α = 0.05   α = 0.10
  2-10   …          …           …          …
  11     0.441      0.501       0.802      0.853
  12     0.431      0.496       0.793      0.844
  13     0.420      0.484       0.791      0.834
  14     0.400      0.472       0.782      0.822
  15     0.388      0.467       0.767      0.817

N = 200

            a = 1                   a = 1.5
  d      α = 0.05   α = 0.10    α = 0.05   α = 0.10
  2      0.327      0.370       0.620      0.660
  3      0.784      0.814       0.984      0.991
  4      0.808      0.849       0.988      0.994
  5      0.823      0.860       0.992      0.994
  6      0.825      0.863       0.991      0.996
  7      0.819      0.864       0.992      0.994
  8      0.814      0.859       0.990      0.994
  9      0.802      0.846       0.990      0.993
  10     0.791      0.837       0.990      0.993
  11     0.766      0.830       0.988      0.992
  12     0.754      0.821       0.987      0.992
  13     0.740      0.800       0.987      0.991
  14     0.734      0.794       0.987      0.991
  15     0.726      0.787       0.986      0.990

Table 5.3. Power of the test based on convergence (5.1). The change-point
is at $k^* = \lfloor N/2 \rfloor$.

for the probability p of rejection. The entries are based on R = 1,000 replications. The
table shows that the test based on convergence (5.1) has correct empirical size at the
5% level and is a bit too conservative at the 10% level. However, even at the 10% level
the empirical sizes for $d \ge 3$ are not significantly different; they all fall into each other's
90% confidence intervals. This illustrates the main point that for the tests that use
the asymptotics with $d \to \infty$ developed in the paper, selecting d is not essential; every
sufficiently large d gives the same conclusion on the significance.
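
As a worked check of the confidence intervals reported with the empirical sizes, the normal-approximation interval for a rejection frequency can be computed as below. The function name `size_interval` is hypothetical; the quantile 1.645 is the standard normal 95th percentile, which is what a two-sided 90% interval requires.

```python
import math

def size_interval(p_hat, R, z=1.645):
    """Asymptotic 90% confidence interval for a rejection probability.

    p_hat is the observed fraction of rejections among R replications.
    """
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / R)
    return p_hat - half_width, p_hat + half_width
```

For instance, `size_interval(0.049, 1000)` gives approximately (0.0378, 0.0602), and `size_interval(0.085, 1000)` gives approximately (0.0705, 0.0995), in agreement with the corresponding entries of Table 5.2.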

The empirical power of the test is reported in Table 5.3. Again, for $d \ge 3$, the power
remains statistically the same. We note that the change in mean equal to the function
$at(1 - t)$ with $a = 1.5$ is fairly small if the "noise curves" are Brownian motions. This is
illustrated in Figure 3 which shows 20 Brownian motions in the left panel and another
independent sample of 20 Brownian motions with the curve $at(1 - t)$, $a = 1.5$, added. If
one knows that this curve was added, one can discern it in the plot in the right panel,
but the difference would have been much less obvious if individual curves were observed,
as in the change-point setting relevant to Table 5.3.

Regarding Corollary 3.1, we found that the test based on convergence (3.3) has
empirical size only slightly higher than nominal (about 1% at the 5% level). For $d \ge 3$, the
empirical size does not depend on d. The test based on (3.4) severely overrejects for
N = 100, and we do not recommend it. The test based on Theorem 4.1 overrejects by
about 2% at the 5% level, and by about 1% at the 10% level. The power of the test is
above 95% for N, M = 100 and a = 1.0, and practically 100% for larger a or N, M. For
$d \ge 2$, the rejection probabilities do not depend on d.



Change-point analysis of annual temperature profiles. The goal of this section is
to illustrate the application of the change-point test based on convergence (5.1).
Change-point analysis is an important field of statistics with a large number of applications;
the recent monographs of Chen and Gupta (2011) and Basseville et al. (2012) provide
numerous references. The change-point problem in the context of functional data has
also received some attention; we refer to Horváth and Kokoszka (2012) for the references.
Aston and Kirch (2012) report some of the most recent research.



The data set we study consists of 156 years (1856-2011) of minimum daily temperatures
in Melbourne. These data are available at www.bom.gov.au (the Australian Bureau of
Meteorology website). The original data can be viewed as 156 curves with 365 measurements
on each curve. We converted them to functional objects in R using 49 Fourier
basis functions. Five consecutive functions are shown in Figure 4. It is important to
emphasize the difference between the data we use and the Canadian temperature data
made popular by the books of Ramsay and Silverman (2005) and Ramsay et al. (2009).
The Canadian temperature curves are the curves at 35 locations in Canada obtained
by averaging annual temperature over forty years. Since each such curve is an average
of forty curves like those shown in Figure 4, those curves are much smoother, and the
first two FPC's are sufficient to describe their variability. Even after smoothing with 49
Fourier functions, the annual temperature curves exhibit noticeable year to year variability,
and a larger number of FPC's is needed to capture it, see Table 5.4. The goals of
our analysis are also different from those of Ramsay and Silverman (2005). We are
interested in detecting a change in the mean function using a sequence of noisy curves; the
examples in Ramsay and Silverman (2005) used the averaged curves to describe static
regression type dependencies between climatic variables.

Figure 4. Five annual temperature curves represented as functional objects.

The analysis proceeds through the usual binary segmentation procedure. The test is first
applied to the whole data set. If the P-value is small, the change-point is estimated as

$\hat\theta_N = \inf\{k : I_N(k) = \sup_{1 \le j \le N} I_N(j)\},$



where 



i=l \j=l 



N-i 
N V 




($I_N$ is a discretization of $\hat Z_N$.) The test is then applied to the two segments, and the procedure continues until no change-points are detected. In practice, a procedure of this type detects only a few change-points (four in our case), so the problems of multiple testing are not an issue. We applied the test using many values of d, and we were pleased to see that the final segmentation does not depend on d. Table 5.5 shows the outcome. The estimated change-points are the years 1892, 1960, 1967, 1996. It is clear that the change-point model is not an exact climatological model for the evolution of annual temperature curves, but it is popular in climate studies, see e.g. Gallagher et al. (2012), as it allows us to attach statistical significance to conclusions and provides periods of approximately constant mean temperature profiles. In this light, the weak evidence for a change-point in 1967 could be viewed as indicating an accelerated change in the period 1960-1995. The estimated mean temperature curves over the segments of approximately constant mean are shown in Figure 5. An increasing pattern of the mean temperature is seen; the mean curve shifted upwards by about two degrees Celsius over the last 150 years. This could be due to the conjectured global temperature increase or the urbanization of the Melbourne area, or a combination of both. A discussion of such issues is, however, beyond the intended scope of this paper.

Figure 5. Average temperature functions in the estimated partition segments.
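The binary segmentation loop described in this section can be sketched in a few lines. This is a minimal illustration and not the authors' implementation. It assumes that the FPC scores and eigenvalues have already been computed, it uses a CUSUM-type statistic of the form "squared bridge of normalized score sums minus x(1 - x)", and it replaces the P-value decision by a hypothetical fixed `threshold`. The function names and the `min_len` guard are likewise assumptions made for the example.

```python
import math


def cusum_curve(scores, eigvals):
    """Return [I(1), ..., I(n)] for an n x d list of FPC scores."""
    n, d = len(scores), len(eigvals)
    totals = [sum(row[j] for row in scores) for j in range(d)]
    partial = [0.0] * d
    curve = []
    for k in range(1, n + 1):
        x = k / n
        value = 0.0
        for j in range(d):
            partial[j] += scores[k - 1][j]
            # Bridge of the partial sums, normalized by sqrt(n * eigenvalue).
            bridge = (partial[j] - x * totals[j]) / math.sqrt(n * eigvals[j])
            value += bridge ** 2 - x * (1.0 - x)
        curve.append(value / math.sqrt(d))
    return curve


def estimate_change_point(scores, eigvals):
    """Smallest k at which the curve attains its maximum, and that maximum."""
    curve = cusum_curve(scores, eigvals)
    best = max(curve)
    return curve.index(best) + 1, best


def binary_segmentation(scores, eigvals, threshold, offset=0, min_len=4):
    """Recursively split while the statistic exceeds the (assumed) threshold.

    For simplicity the same eigenvalues are reused on every segment; in
    practice they would be re-estimated on each segment.
    """
    if len(scores) < 2 * min_len:
        return []
    k, stat = estimate_change_point(scores, eigvals)
    if stat <= threshold or k < min_len or len(scores) - k < min_len:
        return []
    left = binary_segmentation(scores[:k], eigvals, threshold, offset, min_len)
    right = binary_segmentation(scores[k:], eigvals, threshold, offset + k, min_len)
    return left + [offset + k] + right
```

For a single score series whose mean jumps from 0 to 1 after the 30th of 60 observations, `estimate_change_point` returns the index 30, and the recursion stops after one split because both resulting segments have a constant mean.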






  k        1       2       3       4       5       6       7       8
  λ_k   0.7151  0.1469  0.1295  0.1154  0.1046  0.1021  0.0944  0.0868
  f_k   0.2248  0.2711  0.3118  0.3480  0.3809  0.4130  0.4427  0.4700

  k        9      10      11      12      13      14      15      16
  λ_k   0.0845  0.0833  0.0758  0.0732  0.0726  0.0687  0.0661  0.0641
  f_k   0.4966  0.5228  0.5466  0.5696  0.5925  0.6141  0.6349  0.6550

  k       17      18      19      20      21      22      23      24
  λ_k   0.0620  0.0586  0.0559  0.0559  0.0534  0.0508  0.0472  0.0463
  f_k   0.6745  0.6930  0.7105  0.7281  0.7449  0.7609  0.7757  0.7903

  k       25      26      27      28      29      30      31      32
  λ_k   0.0440  0.0427  0.0426  0.0400  0.0377  0.0367  0.0359  0.0325
  f_k   0.8041  0.8175  0.8309  0.8435  0.8553  0.8669  0.8782  0.8884

  k       33      34      35      36      37      38      39      40
  λ_k   0.0320  0.0299  0.0281  0.0274  0.0252  0.0248  0.0228  0.0211
  f_k   0.8985  0.9079  0.9167  0.9253  0.9332  0.9410  0.9482  0.9548

  k       41      42      43      44      45      46      47      48
  λ_k   0.0207  0.0201  0.0188  0.0171  0.0166  0.0163  0.0129  0.0114
  f_k   0.9614  0.9677  0.9736  0.9790  0.9842  0.9893  0.9934  0.9969

Table 5.4. Eigenvalues $\lambda_k$ and percentage of variance explained by the first k eigenvalues, i.e. $f_k = \sum_{j=1}^{k}\lambda_j \big/ \sum_{j=1}^{49}\lambda_j$, for k = 1, 2, ..., 49.
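
As a quick consistency check on how the two rows of Table 5.4 relate (a worked example using only the tabulated values, which are rounded to four decimals), the total variance can be backed out from the first column and used to recover $f_2$:

$$\sum_{j}\lambda_j \approx \frac{\lambda_1}{f_1} = \frac{0.7151}{0.2248} \approx 3.181, \qquad f_2 \approx \frac{0.7151 + 0.1469}{3.181} \approx 0.271,$$

which agrees with the tabulated $f_2 = 0.2711$ up to rounding. In the same way, reaching 85% of explained variance requires $k = 29$ components, since $f_{28} = 0.8435 < 0.85 \le f_{29} = 0.8553$.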
Table 5.5. Segmentation procedure of the data into periods with constant mean function.
Appendix A. Proof of Theorem 2.1

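We start with some elementary properties of the projections $\xi_{i,j}$. Let $|\cdot|$ denote the Euclidean norm of vectors.

Lemma A.1. If Assumptions 2.1, 2.3 and 2.4 hold, then

$$E\xi_i = 0, \tag{A.1}$$

$$E\xi_i\xi_i^T = I_d, \tag{A.2}$$

where $I_d$ is the $d \times d$ identity matrix. Moreover,

$$E|\xi_i|^3 \le \Big(\sum_{j=1}^d \frac{1}{\lambda_j}\Big)^{3/2} E\|Z_i\|^3, \tag{A.3}$$

and for all $1 \le j \le d$

$$E|\xi_{i,j}|^3 \le \lambda_j^{-3/2}\, E\|Z_i\|^3. \tag{A.4}$$
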

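Proof. Since $EZ_i(t) = 0$, the relation in (A.1) is obvious. The orthonormal functions $v_k$ and $v_\ell$ satisfy (2.2), so we get

$$E\xi_{i,k}\xi_{i,\ell} = \frac{1}{(\lambda_k\lambda_\ell)^{1/2}} \int\!\!\int c(t,s)\,v_k(t)\,v_\ell(s)\,dt\,ds = \begin{cases} 0, & \text{if } k \ne \ell, \\ 1, & \text{if } k = \ell, \end{cases}$$

proving (A.2). Using the definition of the Euclidean norm and the Cauchy-Schwarz inequality we conclude

$$\Big(\sum_{j=1}^d \langle Z_i, v_j\rangle^2/\lambda_j\Big)^{3/2} \le \Big(\sum_{j=1}^d \|Z_i\|^2\|v_j\|^2/\lambda_j\Big)^{3/2} = \|Z_i\|^3\Big(\sum_{j=1}^d 1/\lambda_j\Big)^{3/2},$$

since $\|v_j\| = 1$. Taking the expected value of the equation above we obtain (A.3). Clearly,

$$E|\xi_{i,j}|^3 = \lambda_j^{-3/2}\,E|\langle Z_i, v_j\rangle|^3 \le \lambda_j^{-3/2}\,E\|Z_i\|^3. \qquad \Box$$

The next lemma plays a central role in the proof of Theorem 2.1.
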



Lemma A.2. If Assumptions 2.1, 2.3 and 2.4 hold, then for all n we can define independent identically distributed standard normal vectors $\gamma_1, \ldots, \gamma_n$ in $R^d$ such that

$$P\Big\{\Big|\sum_{i=1}^n \xi_i - \sum_{i=1}^n \gamma_i\Big| > c\,n^{3/8} d^{1/4}\big(E|\xi_1|^3 + E|\gamma_1|^3\big)^{1/4}\Big\} \le c\,n^{-1/8} d^{1/4}\big(E|\xi_1|^3 + E|\gamma_1|^3\big)^{1/4},$$

where c is an absolute constant.

Proof. The result is a consequence of Theorem 6.4.1 on p. 207 of Senatov (1998) and the corollary to Theorem 11 in Strassen (1965). □
We note that

$$\big(E|\xi_1|^3 + E|\gamma_1|^3\big)^{1/4} \le \big(E|\xi_1|^3\big)^{1/4} + \big(E|\gamma_1|^3\big)^{1/4}. \tag{A.5}$$

Also, since $|\gamma_1|^2$ is the sum of the squares of d independent standard normal random variables, Minkowski's inequality implies

$$E|\gamma_1|^3 \le c_1 d^{3/2}, \tag{A.6}$$

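with some constant $c_1$, and clearly

$$d^{3/2} \le \Big(\lambda_1 \sum_{j=1}^d \frac{1}{\lambda_j}\Big)^{3/2}. \tag{A.7}$$

Combining Lemma A.2 with (A.3)-(A.7), we conclude that

$$P\Big\{\Big|\sum_{i=1}^n \xi_i - \sum_{i=1}^n \gamma_i\Big| > c_2\,n^{3/8} d^{1/4}\Big(\sum_{j=1}^d \frac{1}{\lambda_j}\Big)^{3/8}\Big\} \le c_2\,n^{-1/8} d^{1/4}\Big(\sum_{j=1}^d \frac{1}{\lambda_j}\Big)^{3/8}, \tag{A.8}$$

where $c_2$ does not depend on d.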

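In the next lemma we provide an upper bound for the variance of $\sum_{i=1}^n(\xi_{i,j} - \gamma_{i,j})$, where $\gamma_i = (\gamma_{i,1}, \ldots, \gamma_{i,d})^T$ is defined in Lemma A.2.

Lemma A.3. If Assumptions 2.1, 2.3 and 2.4 hold, then for any $1 \le j \le d$ we get

$$E\Big(\sum_{i=1}^n \xi_{i,j} - \sum_{i=1}^n \gamma_{i,j}\Big)^2 \le c_3\,\frac{n^{23/24}}{\lambda_j}\Big(d^{1/4}\Big(\sum_{\ell=1}^d \frac{1}{\lambda_\ell}\Big)^{3/8}\Big)^{1/3},$$

where $c_3$ does not depend on d.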
Proof. Let

$$U_n(j) = n^{-1/2}\sum_{i=1}^n(\xi_{i,j} - \gamma_{i,j}) \quad\text{and}\quad r_n = c_2\,n^{-1/8} d^{1/4}\Big(\sum_{\ell=1}^d \frac{1}{\lambda_\ell}\Big)^{3/8}.$$




First we write

$$EU_n^2(j) = E\big[U_n^2(j)\,I\{|U_n(j)| \le r_n\}\big] + E\big[U_n^2(j)\,I\{|U_n(j)| > r_n\}\big]$$



<n + -E 

n 



i=l 



1e 

n 



i=l 



Using Holder's inequality we get that 



E 



i=l 



I{\Unm>rn} 



< E 



< E 



i=l 
n 



2/3 



2/3 



P{\UnU)\>rn} 



1/3 



a/3 



i=l 
by (A.8). Applying now Rosenthal's inequality (cf. Petrov (1995), p. 59) we obtain



E 



n Z ( n / n \ 3/2 >v 

i=l ^ i=l ^i=l ^ ^ 



where C4 is an absolute constant. Hence 

n 3 



E 



i=l 



<c,{nXf' + n'/^}<c,{n/\^f^ 



and therefore 



E 



1=1 



3/8n 1/3 



Following the previous arguments one can show that 



E 



3/8n 1/3 



The constants $c_8$ and $c_9$ do not depend on d. Since, in view of Assumption 3.3, $n r_n^2$ is smaller than the latter rates, this completes the proof of Lemma A.3. □

Proof of Theorem 2.1. We use a blocking argument to construct a Wiener process which is close to the partial sums $\sum_{1\le i\le k}\xi_{i,j}$, $1 \le k \le N$, $1 \le j \le d$. Let K be the number of blocks, to be chosen later, and let $M = \lfloor N/K\rfloor$ denote the block length. For $k = \ell M$, $1 \le \ell \le K$, we write

$$\sum_{i=1}^{k}\xi_{i,j} = \sum_{v=1}^{\ell}\Big(\sum_{i=(v-1)M+1}^{vM}\xi_{i,j}\Big).$$

Using the $\gamma_{i,j}$'s, the independent standard normal random variables constructed in Lemma A.2, we define

$$W_j(k) = \sum_{i=1}^{k}\gamma_{i,j}, \quad 1 \le j \le d,\ 1 \le k \le N. \tag{A.9}$$

By Lemma A.3 we get for any $0 < \delta < 1/2$ and $1 \le j \le d$ via Kolmogorov's inequality (cf. Petrov (1995), p. 54)

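$$\begin{aligned}
P\Big\{\max_{1\le \ell\le K}\Big|\sum_{i=1}^{\ell M}(\xi_{i,j}-\gamma_{i,j})\Big| > N^{1/2-\delta}\Big\}
&= P\Big\{\max_{1\le \ell\le K}\Big|\sum_{v=1}^{\ell}\sum_{i=(v-1)M+1}^{vM}(\xi_{i,j}-\gamma_{i,j})\Big| > N^{1/2-\delta}\Big\} \\
&\le \frac{1}{N^{1-2\delta}}\sum_{v=1}^{K} E\Big(\sum_{i=(v-1)M+1}^{vM}(\xi_{i,j}-\gamma_{i,j})\Big)^2 \\
&\le c_{10}\,\frac{K M^{23/24}}{N^{1-2\delta}}\,\frac{1}{\lambda_j}\Big(d^{1/4}\Big(\sum_{\ell=1}^d \frac{1}{\lambda_\ell}\Big)^{3/8}\Big)^{1/3}.
\end{aligned} \tag{A.10}$$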



One can define independent Wiener processes (standard Brownian motions) $W_j(x)$, $x \ge 0$, $1 \le j \le d$, such that (A.9) holds. We obtained approximations for the partial sums of the $\xi_{i,j}$'s at the points $k = \ell M$, $1 \le \ell \le K$. Next we show that neither the partial sums of the $\xi_{i,j}$'s nor the Wiener processes $W_j(x)$ can oscillate too much between $\ell M$ and $(\ell + 1)M$.

Using again Rosenthal's inequality (cf. Petrov (1995), p. 59) we obtain for all $1 \le j \le d$ that



(A.ll) 



E 



ivi a ^ ivi , ivi ^ 3/2 -X 

i=l ^ %=\ ^i=l ^ 

<Cn{M/Af' + M=^/2| 
<cn(l + Af)(M/A,)^/2 



on account of Lemma A.1. Combining the Marcinkiewicz-Zygmund inequality (cf. Petrov (1995), p. 82) with (A.11) we conclude



(A.12) 

Applying OA. 121) we get 



E\ max 

\<h<M 



Ea.)'< ci2(M/A,)=^/^ 
i=i ^ 



(A.13) 



P < max max 

0<f<A'+l l<h<M 



IM 



m+h 



i=l 



4 = 1 



> Ari/2-5 



< (K + 2)P<^ max 

' l<h<M 



< 



A'(A//A^)'''"^ 



i=l 



> 



iVl/2- 



jY3/2-35 



Lemma 1.2.1 of Csörgő and Révész (1981) yields

$$P\Big\{\max_{0\le \ell\le K}\ \sup_{0\le h\le M}|W_j(\ell M) - W_j(\ell M + h)| > c_{14} M^{1/2}(\log N)^{1/2}\Big\} \le \frac{c_{15}}{N^2}. \tag{A.14}$$



Now choosing $\delta = 1/80$ and $K = \lfloor N^{\beta}\rfloor$ with $\beta = 1/10$, it follows from (A.10), (A.13) and (A.14) for all $1 \le j \le d$ that



(A.15) 



P-{ sup 

0<2/<Af 



> 



^1/2-5 
The result now follows from (A.15) with $W_{j,N}(x) = N^{-1/2}W_j(Nx)$, $0 \le x \le 1$. □
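
The exponents behind the choice $\delta = 1/80$, $\beta = 1/10$ can be checked by direct arithmetic. The following is a sketch that ignores integer parts, so that $K \approx N^{1/10}$ and $M \approx N^{9/10}$, and assumes that the bound in (A.13) is of order $K(M/\lambda_j)^{3/2}/N^{3/2-3\delta}$:

$$\frac{K M^{3/2}}{N^{3/2-3\delta}} \approx N^{\frac{1}{10}+\frac{3}{2}\cdot\frac{9}{10}-\frac{3}{2}+\frac{3}{80}} = N^{\frac{8+108-120+3}{80}} = N^{-1/80}.$$

Similarly, the oscillation level in (A.14) satisfies $M^{1/2}(\log N)^{1/2} \approx N^{36/80}(\log N)^{1/2} = o(N^{1/2-\delta})$, because $1/2 - \delta = 39/80$.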

Appendix B. Proofs of the results of Section 3

We first investigate the weak convergence of the process

$$Z_N(u,x) = \frac{1}{d^{1/2}}\sum_{j=1}^{\lfloor du\rfloor}\big\{(S_{j,N}(x) - xS_{j,N}(1))^2 - x(1-x)\big\}, \quad 0 \le u, x \le 1,$$

with $S_{j,N}(x)$ given by (2.4). The difference between $\hat Z_N(u,x)$ and $Z_N(u,x)$ is that $\hat Z_N$ is computed from the empirical projections $\hat v_1, \ldots, \hat v_d$, while $Z_N$ is based on the unknown population eigenfunctions $v_1, \ldots, v_d$.

Theorem B.l. If AssumpUons\2Jl\2^\2^and\3l^\3^hold, then 

$$Z_N(u,x) \longrightarrow \Gamma(u,x) \quad\text{in } \mathcal{D}[0,1]^2,$$

where the Gaussian process $\Gamma(u,x)$ is defined in Theorem 3.1.

To prove Theorem B.1, we need several lemmas and some additional notation. Let

$$V_{j,N}(x) = S_{j,N}(x) - xS_{j,N}(1) \quad\text{and}\quad B_{j,N}(x) = W_{j,N}(x) - xW_{j,N}(1),$$

where $S_{j,N}$ is defined in (2.4) and the $W_{j,N}$'s are the Wiener processes of Theorem 2.1. It follows from the definition that for each N the processes $B_{j,N}$, $1 \le j \le d$, are independent Brownian bridges.

Lemma B.1. If Assumptions 2.1, 2.3 and 2.4 hold, then

r ^ 

P\ sup ^|\A2^(x)-52^(a;)|>20rfiV-i/^°(logiV)i/2 



0<3;<1'. , 
j=l 



L / j=i J 



|3 



where and c^,* only depend on Ai and E\\Zi\ 
Proof. First we write

$$V_{j,N}^2(x) - B_{j,N}^2(x) = \big(V_{j,N}(x) - B_{j,N}(x)\big)^2 + 2B_{j,N}(x)\big(V_{j,N}(x) - B_{j,N}(x)\big).$$

Since the $B_{j,N}$'s are Brownian bridges, the distribution of the supremum functional of the Brownian bridge (cf. Csörgő and Révész (1981)) gives

p| max sup \Bj^Nix)\>A{\ogNy^A<c,,^, 
l^<j<<i o<x<i ) ly 

where c*^< IS an absolute constant. Now the result follows immediately from Theorem 

o □ 
Now we prove the weak convergence of the partial sums of the squares of independent Brownian bridges. Let $B_1, B_2, \ldots$ be independent Brownian bridges.

Lemma B.2. As $d \to \infty$, we have that

$$\frac{1}{d^{1/2}}\sum_{j=1}^{\lfloor du\rfloor}\big(B_j^2(x) - x(1-x)\big) \longrightarrow \Gamma(u,x) \quad\text{in } \mathcal{D}[0,1]^2,$$

where the Gaussian process $\Gamma(u,x)$ is defined in Theorem 3.1.



Proof. The proof is based on Theorem 2 of Hahn (1978). Let B denote a Brownian bridge and $\theta_1 = \sup_{0\le t\le 1}|B(t)|$. It is clear that $E\theta_1^m < \infty$ for all $m \ge 1$. According to Garsia (1970), there is a random variable $\theta_2$ such that $E\theta_2^m < \infty$ for all $m \ge 1$ and

$$|B(t) - B(s)| \le \theta_2\big(|t-s|\log(1/|t-s|)\big)^{1/2}, \quad 0 \le t, s \le 1.$$

Let $V(t) = B^2(t) - t(1-t)$. We note

$$|V(t) - V(s)| \le 2\theta_1\theta_2\big(|t-s|\log(1/|t-s|)\big)^{1/2} + |t-s|.$$

Thus we get

$$E(V(t) - V(s))^2 \le c_{16}\,|t-s|\log(1/|t-s|) \quad\text{for all } 0 \le t, s \le 1 \tag{B.1}$$

and

$$E\big[(V(t) - V(z))^2(V(z) - V(s))^2\big] \le c_{17}\big(|t-s|\log(1/|t-s|)\big)^2 \tag{B.2}$$

for all $0 \le s \le z \le t \le 1$. The estimates in (B.1) and (B.2) yield that the conditions of Theorem 2 of Hahn (1978) are satisfied, completing the proof of Lemma B.2. □
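
The pathwise bound on the increments of $V$ used in this proof can be spelled out in one step. The following is a sketch, writing $\theta_1 = \sup_{0\le t\le 1}|B(t)|$ and $\theta_2$ for the random variable in Garsia's modulus of continuity bound (the symbol names are chosen here for the illustration). Since $a^2 - b^2 = (a-b)(a+b)$ and $t(1-t) - s(1-s) = (t-s)(1-t-s)$ with $|1-t-s| \le 1$,

$$|V(t) - V(s)| \le |B(t) - B(s)|\,|B(t) + B(s)| + |t-s| \le 2\theta_1\theta_2\big(|t-s|\log(1/|t-s|)\big)^{1/2} + |t-s|.$$

Squaring with $(a+b)^2 \le 2a^2 + 2b^2$ and taking expectations gives

$$E(V(t)-V(s))^2 \le 8\,E(\theta_1^2\theta_2^2)\,|t-s|\log(1/|t-s|) + 2|t-s|^2,$$

where $E(\theta_1^2\theta_2^2) \le (E\theta_1^4\,E\theta_2^4)^{1/2} < \infty$ by the Cauchy-Schwarz inequality. For small $|t-s|$ the second term is dominated by the first, which is the form of the moment estimate for the increments.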



Proof of Theorem B.1. It follows immediately from Lemmas B.1 and B.2. □

The transition from Theorem B.1 to Theorem 3.1 is based on the following lemma, in which the norm is the Hilbert-Schmidt norm.



Lemma B.3. If Assumptions lKT\ \2.S\ and \2.3\ hold, then 
(B.3) 

and 



I A, 



A,| < ||c 



(B.4) 



CjVjW < 



c — c 



where $\hat c_j = \mathrm{sign}(\langle \hat v_j, v_j\rangle)$ are random signs, and $\zeta_1, \zeta_2, \ldots$ are defined in Assumption 3.5.

Proof. Inequality (B.3) can be deduced from the general results presented in Section VI.1 of Gohberg et al. (1990) or in Dunford and Schwartz (1988). These results are presented in a convenient form in Lemma 2.2 in Horváth and Kokoszka (2012). Finally, Lemma 2.3 in Horváth and Kokoszka (2012) gives (B.4). □
Proof of Theorem 3.1. Introducing

i=l 1=1 



we can write 



Zn{u, x) 



1 ^'^''^ r 1 



Ar(x), -Oj)^ — x(l — x) 



Elementary arguments give 

\du\ \du 



[du\ \du\ [d«J , 



1 1 

a" 



[du\ 



1 ^1 

By the Cauchy-Schwarz inequality we have



+ E Y.^(Un{x), v,f - {Un{x), cjv.y 



(B.5) 



j=i ] J j=i 



j=i -^j-^i 



and since — = |a + 6| |a — 6|, 



It follows from the results of Kuelbs (1973) (for a shorter proof we refer to Theorem 6.3 in Horváth and Kokoszka (2012)) that

$$\sup_{0\le x\le 1}\|U_N(x)\|^2 = O_P(1).$$

Due to Assumption 2.4 we can use a Marcinkiewicz-Zygmund type law of large numbers for sums of independent and identically distributed random functions in Banach spaces (cf., e.g., Woyczyński (1978) or Howell and Taylor (1980)) to conclude

||c- c|| = Op{N-^/^). 

Assumption 13.41 gives that N~^^™ / Xd — ?■ and therefore by Lemma [B. 31 

A,; 



max — = Op(l). 

i<i<dx 



So by Lemma IB. 31 and ( IB. 51) we have 

d 



j = l 3 i=l 



A"/3 A3 
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=Op(l) 

on account of Assumptions 3.2 and 3.4. Similarly, (B.6) and Assumption 3.5 yield

J=l J j=l J 

Theorem 3.1 now follows from Theorem B.1. □

Proof of Corollary 3.1. By Lemma B.1 and (B.7), relation (3.2) is proven if we show that

(B.8) ^_|^ sup i?,2(a;)-ci«o| ^ N{0,1), 

where $B_1, B_2, \ldots, B_d$ are independent Brownian bridges. Clearly, (B.8) is an immediate consequence of the central limit theorem. Similarly, to establish (3.3), we need to show only that



^ Ar(0,l). 



The above result is known, see Remark 2.1 in Aue et al. (2009). The same argument can be used to prove (3.4). □

Appendix C. Proofs of the results of Section 4

We note that under the null hypothesis $\bar X_N - \bar Y_M = \bar Z_N - \bar Q_M$. Define

$$F_{N,M} = \sum_{\ell=1}^{N} Z_\ell - \frac{N}{M}\sum_{\ell=1}^{M} Q_\ell.$$

The proof of Theorem 4.1 is based on Lemma A.2; we need to write $F_{N,M}$ as a single sum of independent identically distributed random processes and an additional small remainder term. Let K be an integer and define the integers $R = \lfloor N/K\rfloor$ and $L = \lfloor M/K\rfloor$. Next we define

$$A_i = \sum_{\ell=R(i-1)+1}^{iR} Z_\ell - \frac{N}{M}\sum_{\ell=L(i-1)+1}^{iL} Q_\ell, \quad i = 1, 2, \ldots, K.$$


Clearly, 



where 



K 

'N,M 



i=l 



N M 
e=KR+l l=KL+l 
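
The blocking step can be illustrated numerically. The sketch below uses scalar stand-ins for the functional observations and assumes that the statistic has the form "sum of the first sample minus N/M times the sum of the second sample"; the function name and the toy data are made up for the example. It checks only the algebraic identity that the K block terms plus the remainder add up to the full statistic.

```python
def block_decomposition(z, q, K):
    """Split sum(z) - (N/M) * sum(q) into K block terms and a remainder.

    z, q : lists of numbers standing in for the two samples
    K    : number of blocks; block lengths are R = N // K and L = M // K
    """
    N, M = len(z), len(q)
    R, L = N // K, M // K
    ratio = N / M
    blocks = [
        sum(z[R * i:R * (i + 1)]) - ratio * sum(q[L * i:L * (i + 1)])
        for i in range(K)
    ]
    # Observations left over after the K full blocks form the remainder term.
    remainder = sum(z[K * R:]) - ratio * sum(q[K * L:])
    return blocks, remainder


z = list(range(1, 11))   # N = 10
q = [1, 2, 3, 4, 5]      # M = 5
blocks, remainder = block_decomposition(z, q, K=3)
total = sum(z) - (len(z) / len(q)) * sum(q)
```

Here `sum(blocks) + remainder` equals `total`, and the remainder collects the at most R - 1 and L - 1 trailing observations of the two samples, which is why it is of smaller order than the main sum.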






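We will show first that if $v$ is a function with $\|v\| = 1$, then for every n

$$E\Big|\sum_{i=1}^{n}\langle Z_i, v\rangle\Big|^3 \le c_1 n^{3/2} \tag{C.1}$$

and

$$E\Big|\sum_{i=1}^{n}\langle Q_i, v\rangle\Big|^3 \le c_2 n^{3/2}, \tag{C.2}$$

where $c_1$ and $c_2$ only depend on $E\|Z_1\|^3$ and $E\|Q_1\|^3$, respectively. Using Rosenthal's inequality (cf. Petrov (1995), p. 59) we get

$$E\Big|\sum_{i=1}^{n}\langle Z_i, v\rangle\Big|^3 \le c_3\big\{nE|\langle Z_1, v\rangle|^3 + (nE\langle Z_1, v\rangle^2)^{3/2}\big\},$$

where $c_3$ is an absolute constant. It is easy to see that

$$|\langle Z_1, v\rangle| \le \|Z_1\|,$$

which implies (C.1). The same argument can be used to prove (C.2).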



Next we define the function 



C7V,Af(t, S) = C{t, S) + ]^^C*(t, S). 

It is clear that $c_{N,M}$ is a covariance function and therefore we can find $\kappa_1 = \kappa_1(N,M) \ge \kappa_2 = \kappa_2(N,M) \ge \ldots$ and orthonormal functions $u_1(t) = u_1(t; N,M), u_2(t) = u_2(t; N,M), \ldots$ satisfying

$$\kappa_i u_i(t) = \int c_{N,M}(t,s)\,u_i(s)\,ds, \quad 1 \le i < \infty.$$

Now we define the vector

$$\psi_i = \big(\langle A_i, u_1\rangle/(R\kappa_1)^{1/2},\ \langle A_i, u_2\rangle/(R\kappa_2)^{1/2},\ \ldots,\ \langle A_i, u_d\rangle/(R\kappa_d)^{1/2}\big)^T, \quad 1 \le i \le K.$$

It is easy to see that $\psi_i$, $1 \le i \le K$, are independent and identically distributed random vectors with mean 0 and $E\psi_i\psi_i^T = I_d$, where $I_d$ is the $d \times d$ identity matrix. Also, (C.1) and (C.2) imply that

$$E|\psi_1|^3 \le c_4\Big(\sum_{\ell=1}^{d}\frac{1}{\kappa_\ell}\Big)^{3/2},$$

where $c_4$ only depends on $E\|Z_1\|^3$ and $E\|Q_1\|^3$. Using Lemma A.2 we obtain similarly to (A.8) that there are independent standard normal random vectors $\gamma_i = \gamma_i(N,M)$, $1 \le i \le K$, in $R^d$ such that
K K , d \ 3/8>, 

(C.3) p Y.^.-Y.^.>c,K^i'd^''\y^^i-^A 



3/8 
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where C5 does not depend on d. Let 

It follows from ( IC.ll) and (IC.2I) that with some constant ce, not depending on d we have 

d \ 3/2 



and therefore by Markov's inequality for every x > 

^3/2 



(C.4) 
Let 



3/2 



3jY3/2 



^=1 



Next we choose /sT = [N'^l^\ in (JOS]), ([031) and x = i^"^/^(Eti V^^)^^^ in (Q to 
conclude that there is 7^? ^, a standard normal random vector in i?'' such that 

3/8 



(C.5) 



P< 



1 



--I^N,M - 1n,M 



3/8 



where A^* = [A^/ [A^3/4j j |^]y3/4j _ Ugjng the definitions of Cp and c^v m, together with 
Assumption 14. 3[ we conclude 

(C.6) \\cp-cn,m\\ = 0{N-"^), 

by Lemma 2.3 of Horvath and Kokoszka ( 20121 ). cf. Lemma [B. 31 we have 



so 



(C.7) 



\Ki - Ki\ < Cg ||Cp - Civ,Af|| = 0{N ^/^). 



Using Assumption 14.51 we conclude that 



e=i \e=i / 



Hence it follows from (IC.SP and Assumption 14.51 that 



1/2^ 



N 



Since $|\gamma_{N,M}|^2$ is a $\chi^2$ random variable with d degrees of freedom, Assumption 4.5 yields that

' N* 



N 



1 



l7Af,Ml = Op 



(rfV2). 
It is well known that $(|\gamma_{N,M}|^2 - d)/(2d)^{1/2}$ converges in distribution to a standard normal random variable, and therefore

^ Ii^nM'-A 4 iV(o,i), 



V2d [N' 

where A^(0, 1) stands for a standard normal random variable. 

The difference between \kn,m\'^/N and Dn,m is that the projections are done into the 
direction of different functions (wi's and Mi's, respectively) and the normalizations (kj's 
and Kj's, respectively) are also different. However, using the Marcinkiewicz-Zygmund 
law of large numbers in a Banach space together with ( ]C.6[) and Assumption 14.51 we 
obtain that 

\\ip-CN,M\\ = Op{N-'/'). 

Hence, in view of (lC.7p . also 

sup \ki — Ri\ = Op{N^^^'^), 

i 

and there are random signs di such that 



-1 

sup 



5^1A, \\u,-lu,\\=Op{N-^'^) 



=1 



So repeating the arguments used in the proof of Theorem 3.1 we get



Dn,M — J^\l^N,M 



op{d 



1/2N 



completing the proof. 
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